Datasets help you organize, collect, track, and version examples for LLM application evaluation for easy comparison. You can create and interact with Datasets programmatically and via the UI.
This page describes:
- Basic Dataset operations in Python and TypeScript and how to get started
- How to create a Dataset in Python and TypeScript from objects such as Weave calls
- Available operations on a Dataset in the UI
Dataset quickstart
The following code samples demonstrate how to perform fundamental Dataset operations using Python and TypeScript. Using the SDKs, you can:
- Create a Dataset
- Publish the Dataset
- Retrieve the Dataset
- Access a specific example in the Dataset
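The four operations listed above can be sketched in Python as follows. This is a minimal sketch, not the page's original sample: the project name `intro-example`, the Dataset name `grammar`, and the example rows are placeholder assumptions, and the function only works with the `weave` package installed and a logged-in W&B account.

```python
# Placeholder example rows (assumed for illustration); each row is a plain dict
# and all rows share the same keys.
rows = [
    {"sentence": "He no likes ice cream.", "correction": "He doesn't like ice cream."},
    {"sentence": "She goed to the store.", "correction": "She went to the store."},
    {"sentence": "They plays video games all day.", "correction": "They play video games all day."},
]


def run_quickstart(project="intro-example", name="grammar"):
    """Create, publish, retrieve, and index into a Dataset."""
    # Imported lazily so the rows above can be inspected without Weave installed.
    import weave

    weave.init(project)                             # connect to the (assumed) project
    dataset = weave.Dataset(name=name, rows=rows)   # 1. create the Dataset
    weave.publish(dataset)                          # 2. publish it
    fetched = weave.ref(name).get()                 # 3. retrieve it by name
    return fetched.rows[2]["sentence"]              # 4. access a specific example
```

Calling `run_quickstart()` publishes a new Dataset version to the project and returns the `sentence` field of the third row.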
Create a Dataset from other objects
In Python, Datasets can also be constructed from common Weave objects like calls, and from Python objects like `pandas.DataFrame`s. This feature is useful if you want to create an example Dataset from specific examples.
Weave call
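As a minimal sketch of building a Dataset from Weave calls with `from_calls`: the op `model`, the task names, and the project name below are hypothetical and chosen only for illustration. The sketch assumes `weave` is installed and that you are logged in, since calls are only recorded after `weave.init`.

```python
# Hypothetical inputs used to generate two calls.
TASKS = ["fetch", "parse"]


def dataset_from_calls(project="intro-example", tasks=TASKS):
    """Run an op once per task and collect the resulting calls into a Dataset."""
    # Imported lazily so this module loads even without Weave installed.
    import weave

    weave.init(project)

    @weave.op()
    def model(task: str) -> str:
        return f"Now working on {task}"

    calls = []
    for task in tasks:
        # op.call() returns the result together with the recorded Call object.
        _, call = model.call(task=task)
        calls.append(call)

    # Pass the call objects as a list.
    return weave.Dataset.from_calls(calls)
```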
To create a Dataset from one or more Weave calls, retrieve the call object(s) and pass them as a list to the `from_calls` method.
Pandas DataFrame
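The round trip between a pandas `DataFrame` and a Dataset might look like the sketch below. The records are made-up sample data, and the function assumes both `pandas` and `weave` are installed.

```python
# Made-up sample records (assumption for illustration).
RECORDS = [
    {"text": "Hello", "label": "greeting"},
    {"text": "Goodbye", "label": "farewell"},
]


def pandas_round_trip(records=RECORDS):
    """Convert a DataFrame to a Dataset and back again."""
    # Imported lazily so the sample records can be inspected without the libraries.
    import pandas as pd
    import weave

    df = pd.DataFrame(records)
    dataset = weave.Dataset.from_pandas(df)   # DataFrame -> Dataset
    df_back = dataset.to_pandas()             # Dataset -> DataFrame
    return df, df_back
```

With both libraries available, the two returned frames should hold the same rows, which you can check with `df.equals(df_back)`.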
To create a Dataset from a pandas `DataFrame` object, use the `from_pandas` method. To convert the Dataset back, use `to_pandas`.
Hugging Face Datasets
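A possible round trip between a Hugging Face dataset and a Weave Dataset is sketched below. The column data is invented for illustration, and the sketch assumes the Hugging Face `datasets` package and `weave` are both installed. It passes a single split explicitly rather than relying on the default split selection.

```python
# Invented column data (assumption for illustration).
COLUMNS = {"text": ["Hello", "World"], "label": [0, 1]}


def hf_round_trip(columns=COLUMNS):
    """Convert Hugging Face datasets to Weave Datasets and back."""
    # Imported lazily so the sample columns can be inspected without the libraries.
    from datasets import Dataset as HFDataset, DatasetDict
    import weave

    hf_dataset = HFDataset.from_dict(columns)
    weave_dataset = weave.Dataset.from_hf(hf_dataset)       # single HF Dataset

    # With a DatasetDict, select the split you want before converting.
    hf_dataset_dict = DatasetDict({"train": hf_dataset, "test": hf_dataset})
    weave_test = weave.Dataset.from_hf(hf_dataset_dict["test"])

    hf_back = weave_dataset.to_hf()                          # back to Hugging Face
    return weave_dataset, weave_test, hf_back
```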
To create a Dataset from a Hugging Face `datasets.Dataset` or `datasets.DatasetDict` object, first ensure you have the necessary dependencies installed. Then, use the `from_hf` method. If you provide a `DatasetDict` with multiple splits (like ‘train’, ‘test’, ‘validation’), Weave automatically uses the ‘train’ split and issues a warning. If the ‘train’ split is not present, Weave raises an error. You can provide a specific split directly (e.g., `hf_dataset_dict['test']`).
To convert a `weave.Dataset` back to a Hugging Face `Dataset`, use the `to_hf` method.
Create, edit, and delete a Dataset in the UI
You can create, edit, and delete Datasets in the UI.
Create a new Dataset
1. Navigate to the Weave project you want to edit.
2. In the sidebar, select Traces.
3. Select one or more calls that you want to create a new Dataset for.
4. In the upper right-hand menu, click the Add selected rows to a dataset icon (located next to the trashcan icon).
5. From the Choose a dataset dropdown, select Create new. The Dataset name field appears.
6. In the Dataset name field, enter a name for your dataset. Options to Configure dataset fields appear.
   :::important
   Dataset names must start with a letter or number and can only contain letters, numbers, hyphens, and underscores.
   :::
7. (Optional) In Configure dataset fields, select the fields from your calls to include in the dataset.
   - You can customize the column names for each selected field.
   - You can select a subset of fields to include in the new Dataset, or deselect all fields.
8. Once you’ve configured the dataset fields, click Next. A preview of your new Dataset appears.
9. (Optional) Click any of the editable fields in your Dataset to edit the entry.
10. Click Create dataset. Your new dataset is created.
11. In the confirmation popup, click View the dataset to view the new Dataset. Alternatively, go to the Datasets tab.
Edit a Dataset
1. Navigate to the Weave project containing the Dataset you want to edit.
2. From the sidebar, select Datasets. Your available Datasets display.
3. In the Object column, click the name and version of the Dataset you want to edit. A pop-out modal displays Dataset information such as name, version, author, and Dataset rows.
4. In the upper right-hand corner of the modal, click the Edit dataset button (the pencil icon). A + Add row button displays at the bottom of the modal.
5. Click + Add row. A green row displays at the top of your existing Dataset rows, indicating that you can add a new row to the Dataset.
6. To add data to a new row, click the desired column within that row. An editing modal appears with Text, Code, and Diff options for formatting. The default id column in a Dataset row cannot be edited, as Weave assigns it automatically upon creation.
7. Repeat step 6 for each column that you want to add data to in the new row.
8. Repeat step 5 for each row that you want to add to the Dataset.
9. Once you’re done editing, publish your Dataset by clicking Publish in the upper right-hand corner of the modal. Alternatively, if you don’t want to publish your changes, click Cancel.

Once published, the new version of the Dataset with updated rows is available in the UI.
Delete a Dataset
1. Navigate to the Weave project containing the Dataset you want to delete.
2. From the sidebar, select Datasets. Your available Datasets display.
3. In the Object column, click the name and version of the Dataset you want to delete. A pop-out modal displays Dataset information such as name, version, author, and Dataset rows.
4. In the upper right-hand corner of the modal, click the trashcan icon. A pop-up modal prompting you to confirm Dataset deletion displays.
5. In the pop-up modal, click the red Delete button to delete the Dataset. Alternatively, click Cancel if you don’t want to delete the Dataset.

The Dataset is now deleted and no longer visible in the Datasets tab in your Weave dashboard.
Add a new example to a Dataset
1. Navigate to the Weave project you want to edit.
2. In the sidebar, select Traces.
3. Select one or more calls that you want to add to a Dataset as new examples.
4. In the upper right-hand menu, click the Add selected rows to a dataset icon (located next to the trashcan icon). Optionally, toggle Show latest versions to off to display all versions of all available datasets.
5. From the Choose a dataset dropdown, select the Dataset you want to add examples to. Options to Configure field mapping display.
6. (Optional) In Configure field mapping, adjust the mapping of fields from your calls to the corresponding dataset columns.
7. Once you’ve configured field mappings, click Next. A preview of your updated Dataset appears.
8. In the empty row (green), add your new example value(s). Note that the id field is not editable and is created automatically by Weave.
9. Click Add to dataset. Alternatively, to return to the Configure field mapping screen, click Back.
10. In the confirmation popup, click View the dataset to see the changes. Alternatively, navigate to the Datasets tab to view the updates to your Dataset.
Other Dataset Operations
Selecting Rows
You can select specific rows from a Dataset by their index using the `select` method. This is useful for creating subsets of your data.
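Selecting rows by index could look like the sketch below. The Dataset name, rows, and default indices are illustrative assumptions, and the function requires the `weave` package.

```python
# Illustrative rows (assumption); indices refer to positions in this list.
ROWS = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What color is the sky?", "answer": "Blue"},
]


def select_subset(indices=(0, 2), rows=ROWS):
    """Build a Dataset and return a new Dataset with only the chosen rows."""
    # Imported lazily so the sample rows can be inspected without Weave installed.
    import weave

    dataset = weave.Dataset(name="select-example", rows=rows)
    # select() takes the row indices to keep.
    return dataset.select(list(indices))
```

With the default indices, the subset is expected to contain the first and third rows of `ROWS`.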